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Abstract — We study the problem of reconstructing a sparse 
signal from a limited number of its linear projections when a part 
of its support is known, although the known part may contain 
some errors. The "known" part of the support, denoted T, may 
be available from prior knowledge. Alternatively, in a problem 
of recursively reconstructing time sequences of sparse spatial 
signals, one may use the support estimate from the previous time 
instant as the "known" part. The idea of our proposed solution 
(modified-CS) is to solve a convex relaxation of the following 
problem: find the signal that satisfies the data constraint and is 
sparsest outside of T. We obtain sufficient conditions for exact 
reconstruction using modified-CS. These are much weaker than 
those needed for compressive sensing (CS) when the sizes of the 
unknown part of the support and of errors in the known part 
are small compared to the support size. An important extension 
called Regularized Modified-CS (RegModCS) is developed which 
also uses prior signal estimate knowledge. Simulation compar- 
isons for both sparse and compressible signals are shown. 

I. Introduction 

In this work, we study the sparse reconstruction problem 
from noiseless measurements when a part of the support is 
known, although the known part may contain some errors. 
The "known" part of the support may be available from prior 
knowledge. For example, consider MR image reconstruction 
using the 2D discrete wavelet transform (DWT) as the sparsi- 
fying basis. If it is known that an image has no (or very little) 
black background, all (or most) approximation coefficients 
will be nonzero. In this case, the "known support" is the 
set of indices of the approximation coefficients. Alternatively, 
in a problem of recursively reconstructing time sequences 
of sparse spatial signals, one may use the support estimate 
from the previous time instant as the "known support". This 
latter problem occurs in various practical applications such as 
real-time dynamic MRI reconstruction, real-time single-pixel 
camera video imaging or video compression/decompression. 
There are also numerous other potential applications where 
sparse reconstruction for time sequences of signals/images 
may be needed, e.g. see 0, J4). 

Sparse reconstruction has been well studied for a while, e.g. 
see |]3, ©■ Recent work on Compressed Sensing (CS) gives 
conditions for its exact reconstruction Q, 0, J9) and bounds 
the error when this is not possible |[T0l . ifTTl . 

Our recent work on Least Squares CS-residual (LS-CS) 
lfl2l . Ifl3l can be interpreted as a solution to the problem 
of sparse reconstruction with partly known support. LS-CS 
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(a) Top: larynx image sequence, Bottom: cardiac sequence 
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(b) Slow support change plots. Left: additions, Right: removals 
Fig. 1. In Fig. |l(a)| we show two medical image sequences. 
In Fig. |l(b)[ N t refers to the 99% energy support of the two- 
level Daubechies-4 2D discrete wavelet transform (DWT) of these 
sequences. \Nt\ varied between 4121-4183 (« 0.07m) for larynx 
and between 1108-1127 (~ 0.06m) for cardiac. We plot the number 
of additions (left) and the number of removals (right) as a fraction 
of | TVt | . Notice that all changes are less than 2% of the support size. 

replaces CS on the observation by CS on the LS observation 
residual, computed using the "known" part of the support. 
Since the observation residual measures the signal residual 
which has much fewer large nonzero components, LS-CS 
greatly improves reconstruction error when fewer measure- 
ments are available. But the exact sparsity size (total number of 
nonzero components) of the signal residual is equal to or larger 
than that of the signal. Since the number of measurements 
required for exact reconstruction is governed by the exact 
sparsity size, LS-CS is not able to achieve exact reconstruction 
using fewer noiseless measurements than those needed by CS. 

Exact reconstruction using fewer noiseless measurements 
than those needed for CS is the focus of the current work. 
Denote the "known" part of the support by T. Our proposed 
solution (modified-CS) solves an i\ relaxation of the following 
problem: find the signal that satisfies the data constraint and is 
sparsest outside of T. We derive sufficient conditions for exact 
reconstruction using modified-CS. When T is a fairly accurate 
estimate of the true support, these are much weaker than the 
sufficient conditions for CS. For a recursive time sequence 
reconstruction problem, this holds if the reconstruction at 
t = is exact and the support changes slowly over time. 
The former can be ensured by using more measurements at 
t = 0, while the latter is often true in practice, e.g. see Fig. Q] 

We also develop an important extension called Regularized 
Modified-CS which also uses prior signal estimate knowledge. 
It improves the error when exact reconstruction is not possible. 
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A shorter version of this work first appeared in ISIT'09 
H). In parallel and independent work in 03), Khajehnejad 
et al have also studied a similar problem to ours but they 
assume a probabilistic prior on the support. Other related work 
includes fl31 . Very recent work on causal reconstruction of 
time sequences includes lfl6l (focusses on the time-invariant 
support case) and flTTl (use past estimates to only speed up the 
current optimization but not to improve reconstruction error). 
Except M4V , none of these prove exact reconstruction using 
fewer measurements and except 4751/ . HI 4H , none of these even 
demonstrate it. 

Other recent work, e.g. [181 . applies CS on observation 
differences to reconstruct the difference signal. While their 
goal is to only estimate the difference signal, the approach 
could be easily modified to also reconstruct the actual signal 
sequence (we refer to this as CS-diff). But, since all nonzero 
coefficients of a sparse signal in any sparsity basis will 
typically change over time, though gradually, and some new 
elements will become nonzero, thus the exact sparsity size of 
the signal difference will also be equal to/larger than that of 
the signal itself. As a result CS-diff will also not achieve exact 
reconstruction using fewer measurements, e.g. see Figj3] 

In this work, whenever we use the term CS, we are actually 
referring to basis pursuit (BP) Q. As pointed out by an 
anonymous reviewer, modified-CS is a misnomer and a more 
appropriate name for our approach should be modified-BP . 

As pointed out by an anonymous reviewer, modified-CS 
can be used in conjunction with multiscale CS for video 
compression |fl9l to improve their compression ratios. 

The paper is organized as follows. We give the notation and 
problem definition below. Modified-CS is developed in Sec. 
HT1 We obtain sufficient conditions for exact reconstruction 
using it in Sec. [Til] In Sec. |IV] we compare these with the 
corresponding conditions for CS and we also do a Monte 
Carlo comparison of modified-CS and CS. We discuss Dy- 
namic Modified-CS and Regularized Modified CS in Sec. [V] 
Comparisons for actual images and image sequences are given 
in Sec. IVII and conclusions and future work in Sec. I VII I 

A. Notation 

We use ' for transpose. The notation ||c||fc denotes the Ik 
norm of the vector c. The lo pseudo-norm, ||c||o, counts the 
number of nonzero elements in c. For a matrix, M, ||A/| 
denotes its induced i 2 norm, i.e. ||M|| := max c .|| c || 2=1 || Af c|| 2- 

We use the notation At to denote the sub-matrix containing 
the columns of A with indices belonging to T. For a vector, the 
notation (/3)t (or /3t) refers to a sub-vector that contains the 
elements with indices in T. The notation, [1, n] := [1, 2, . . . n]. 
We use T c to denote the complement of the set T w.r.t. [1, n], 
i.e. T c := [1, n]\T. The set operations, U, n stand for set union 
and intersection respectively. Also T\ \ T2 := T\ DTg denotes 
set difference. For a set T, \T\ denotes its size (cardinality). 
But for a scalar, b, \b\ denotes the magnitude of b. 

The S'-restricted isometry constant (9], Sg, for a matrix, A, 
is defined as the smallest real number satisfying 

(i-fe)iiciii<m TC |n<(i+^)i| C ||= (i) 



for all subsets Tc [T 71 ] °f cardinality |T| < S and all real 
vectors c of length \T\. The restricted orthogonality constant 
H), 0Si,s 2 > i s defined as the smallest real number satisfying 

\ci A Tl ' A T2 c 2 \ < 6> Sl :J s- 2 1| ci || 2 1! c 2 1| 2 (2) 

for all disjoint sets T X ,T 2 C [l,n] with |2\| < Si, \T 2 \ < S 2 
and Si + S 2 < n, and for all vectors ci, c 2 of length |Ti|, 
\T 2 \ respectively. By setting ci = At/ At 2 c 2 in (fJJ, 

\\A t ,'At 2 \\<0 Su s 2 (3) 

The notation X ~ AT(/i, E) means that X is Gaussian 
distributed with mean /1 and covariance E while Af(x; fi, E) 
denotes the value of the Gaussian PDF computed at point x. 

B. Problem Definition 

We measure an m-length vector y where 

y := Ax (4) 

We need to estimate x which is a sparse n-length vector with 
n > m. The support of x, denoted TV, can be split as N = 
T U A \ A e where T is the "known" part of the support, 
A e := T\N is the error in the the known part and A := N\T 
is the unknown part. Thus, A e C T, A, T are disjoint and 
|AT| = |T| + |A|-|A e |. 

We use s := \N\ to denote the size of the (s)upport, k := |T| 
to denote the size of the (k)nown part of the support, e = |A e | 
to denote the size of the (e)rror in the known part and u = |A| 
to denote the size of the (u)nknown part of the support. 

We assume that A satisfies the S'-restricted isometry prop- 
erty (RIP) 13 for S = (s + e + u) = [k + 2u). S-RIP means 
that 8s < 1 where 5s is the RIP constant for A defined in (Q]). 

In a static problem, T is available from prior knowledge. For 
example, in the MRI problem described in the introduction, 
let N be the (unknown) set of all DWT coefficients with 
magnitude above a certain zeroing threshold. Assume that the 
smaller coefficients are set to zero. Prior knowledge tells us 
that most image intensities are nonzero and so the approxima- 
tion coefficients are mostly nonzero. Thus we can let T be the 
(known) set of indices of all the approximation coefficients. 
The (unknown) set of indices of the approximation coefficients 
which are zero form A e . The (unknown) set of indices of the 
nonzero detail coefficients form A. 

For the time series problem, y = yt and x = xt with 
support, N t = T U A \ A e , and T = N t _i is the support 
estimate from the previous time instant. If exact reconstruction 
occurs at t— 1, T = N t -i- In this case, A e = N t -i\N t is the 
set of indices of elements that were nonzero at t — 1, but are 
now zero (deletions) while A = N t \N t _i is the newly added 
coefficients at t (additions). Slow sparsity pattern change over 
time, e.g. see Fig. Q] then implies that u = |A| and e= |A e | 
are much smaller than s = |7V|. 

When exact reconstruction does not occur, A e includes both 
the current deletions and the extras from t — 1, Nt—i \ Nt—i. 
Similarly, A includes both the current additions and the misses 
from t — 1, N t _i \ Nt—i. In this case, slow support change, 
along with Nt-i ~ ^t-i, still implies that u <^ s and e <C s. 
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II. Modified Compressive Sensing (modified-CS) 

Our goal is to find a signal that satisfies the data constraint 
given in © and whose support contains the smallest number 
of new additions to T, although it may or may not contain all 
elements of T. In other words, we would like to solve 

min ||(/3)t c ||o subject to y = A/3 (5) 

If A e is empty, i.e. if N — T U A, then the solution of ^ is 
also the sparsest solution whose support contains T. 

As is well known, minimizing the £q norm is a combina- 
torial optimization problem [20). We propose to use the same 
trick that resulted in CS BI 1171. M. flOl. We replace the £ 
norm by the l\ norm, which is the closest norm to £q that 
makes the optimization problem convex, i.e. we solve 

mm || (/3)t c || i subject to y = A/3 (6) 

Denote its output by x. If needed, the support can be estimated 
as 

N:= {i€ [l,n] : (£)? > a} (7) 

where a > is a zeroing threshold. If exact reconstruction 
occurs, a can be zero. We discuss threshold setting for cases 
where exact reconstruction does not occur in Sec. IV-AI 

III. Exact Reconstruction Result 

We first analyze the £q version of modified-CS in Sec. 
IIII-AI We then give the exact reconstruction result for the 
actual l\ problem in Sec. IIII-BI In Sec. IIII-CI we give the 
two key lemmas that lead to its proof and we explain how 
they lead to the proof. The complete proof is given in the 
Appendix. The proof of the lemmas is given in Sec. IIII-DI 

Recall that k = \T\, u = |A|, e = |A e | and s = \N\. 

A. Exact Reconstruction Result: £q version of modified-CS 

Consider the £q problem, ©. Using a rank argument similar 
to (9l Lemma 1.2] we can show the following. The proof is 
given in the Appendix. 

Proposition 1: Given a sparse vector, x, with support, N = 
TU A\ A e , where A and T are disjoint and A e C T. Consider 
reconstructing it from y := Ax by solving (0. x is the unique 
minimizer of © if Sk+ 2u < 1 (A satisfies the (k + 2u)-RIP). 

Using k = s + e — u, this is equivalent to 5 s +e+u < 1. 
Compare this with [9, Lemma 1.2] for the £0 version of CS. 
It requires <52 S < 1 which is much stronger when u -C s and 
e -C s, as is true for time series problems. 

B. Exact Reconstruction Result: modified-CS 

Of course we do not solve © but its l\ relaxation, (O. 
Just like in CS, the sufficient conditions for this to give 
exact reconstruction will be slightly stronger. In the next few 
subsections, we prove the following result. 

Theorem 1 (Exact Reconstruction): Given a sparse vector, 
x, whose support, N = T U A \ A e , where A and T are 



disjoint and A e C T. Consider reconstructing it from y := Ax 
by solving (|6). x is the unique minimizer of (O if 

1) 4+u < 1 and § 2u + 5 k + 6 2 k 2u < 1 and 

2) afc(2u, u) + ak{u 1 u) < 1 where 

Q _ 1 S,k e S,k 

a k (S,S):=^ tz5s- (8) 



The above conditions can be rewritten using k = s + e — u. 

To understand the second condition better and relate it to 
the corresponding CS result, let us simplify it. ak(2u,u) + 

ak(u,u) < -p b — . Simplifying further, a suffi- 

cient condition for afe(2u, u) + Qk{u, u) < 1 is Q Uj 2u + &u,u + 
^ ^ 2u ^ F urtner ' a sufficient condition for this 

is u ,u + S 2u + 0u,2u + S k + 2 u k + W\ u k < 1. 

To get a condition only in terms of 5s's, use the fact that 
S — $S+S 0- A sufficient condition is 2<$2u + + ^fe + 

^k+u + ^t+2u < 1- Further, notice that if u < k and if 

Sk+2u < l/5,then2S 2 u+S3u+Sk+S 2 k+u +2Sl +2u < 45 fe+2 „ + 

4+2n(34+2„) < (4 + 3/5)4+2n < 23/25 < 1. 

Corollary 1 (Exact Reconstruction): Given a sparse vector, 
x, whose support, N = T U A \ A e , where A and T are 
disjoint and A e C T. Consider reconstructing it from y := Ax 
by solving ©. 

• x is the unique minimizer of (O if Sk+ U < 1 and 

(s 2u + e u , u + e u , 2u ) + (s k + e 2 ku + 26 2 au ) < 1 (9) 

• This, in turn, holds if 

2S 2u + S 3u + S k + S 2 k+U + 25 2 k+2u < 1. 

• This, in turn, holds if u < k and 

Sk+ 2 u < 1/5. 

These conditions can be rewritten by substituting k = s+e—u. 
Compare (O to the sufficient condition for CS given in (9): 

S 2s + s , s + 9 s , 2s < 1 (10) 

As shown in Fig. Q] usually u <C s, e -c s and u w e 
(which means that k w s). Consider the case when the number 
of measurements, m, is smaller than what is needed for exact 
reconstruction for a given support size, s, but is large enough 
to ensure that 6k. 2u < 1/2. Under these assumptions, compare 
(O with ( [Tol l. Notice that (a) the first bracket of the left hand 
side (LHS) of (O will be small compared to the LHS of ( fTOb . 
The same will hold for the second and third terms of its second 
bracket compared with the second and third terms of ( fTOb . The 
first term of its second bracket, Sk, will be smaller than the 
first term of ( fTOl , S 2s . Thus, for a certain range of values of 
to, the LHS of (0 will be smaller than that of ( fTOl and it may 
happen that (0 holds, but ( fTOl does not hold. For example, if 
to < 2s, (fTOji will not hold, but if s + u + e < to < 2s, © 
can hold if u, e are small enough. A detailed comparison is 
done in Sec. [IV] 
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C. Proof of Theorem [7} Main Lemmas and Proof Outline 

The idea of the proof is motivated by that of (9] Theorem 
1.3]. Suppose that we want to minimize a convex function 
J (fl) subject to A/3 = y and that J is differentiable. The 
Lagrange multiplier optimality condition requires that there 
exists a Lagrange multiplier, w, s.t. VJ((3) — A'w = 0. Thus 
for i to be a solution we need A'w = VJ(x). In our case, 

J(x) = IMi=£ 



. Thus (VJ(x))j = for j e T 
and (VJ(x))j = sgn(xj) for j G A. For j'^TuA, x j = 0. 
Since J is not differentiable at 0, we require that (A'w)j = 
A/w = w'Aj lie in the subgradient set of J(Xj) at 0, which 
is the set [—1,1] flUJ. In summary, we need a w that satisfies 



w'Aj = if j G T, w'Aj = sgn(xj) if j G A, and 



/A,-| < 1, if j i TU A 



(ID 



LemmaQJbelow shows that by using dTTb but with |u/A,-| < 1 
replaced by |u/Aj| < 1 for all j ^ TU A, we get a set of 
sufficient conditions for x to be the unique solution of ((6]). 

Lemma 1: The sparse signal, x, with support as defined in 
Theorem Q] and with y := Ax, is the unique minimizer of (O 
if Sk+u < 1 and if we can find a vector w satisfying 

1) w'Aj = if j G T 

2) -u/A, = sgn(zj) if j G A 

3) \w'Aj\ < 1, if j g TU A 
Recall that fc = |T| and u = |A|. 

The proof is given in the next subsection. 

Next we give Lemma|2] which constructs a w which satisfies 
At'w = and AtJw — c for any set Td disjoint with T of 
size \Td\ < S and for any given vector c of size \Tj\. It also 
bounds \A/w\ for all j g T U T d U E where E is called an 
"exceptional set". We prove Theorem Q] by applying Lemma 
|2] iteratively to construct a w that satisfies the conditions of 
Lemma [TJ under the assumptions of Theorem [TJ 

Lemma 2: Given the known part of the support, T, of size 
k. Let S, S be such that k + S + S < n and 5 s + 5 k + 9l s < 1. 
Let c be a vector supported on a set T^, that is disjoint with 
T, of size \TA < 5. Then there exists a vector u> and an 
exceptional set, E, disjoint with T U Td, s.t. 



Aj'w = 0, VjeT 
V j G T d 



(12) 



|£|<S 
||^'*|| 2 < a fe (5,5)||c|| 2 

\A/w\< ak ^ S) \\c\\ 2 \fj ^TUTdLiE and 
||«>||2 < K k (S)\\c\\ 2 
where a k (S, S) is defined in ([8j and 



if fc (S) := 



VT+Ss 



(13) 



(14) 



The proof is given in the next subsection. 

Proof Outline of Theorem \T\ To prove Theorem [TJ apply 
Lemma [2] iteratively, in a fashion similar to that of the proof 
of J9] Lemma 2.2] (this proof had some important typos). The 



main idea is as follows. At iteration zero, apply Lemma [2] 
with T d = A (so that S = u), Cj = sga(xj) V j G A, and 
S = u, to get a Wi and an exceptional set i, of size less 
than u, that satisfy the above conditions. At iteration r > 0, 
apply Lemma |2] with T d = A U T d , r (so that 5 = 2u), Cj = 
V j G A, Cj = Aj'w r V j £ Xd. r and S* = u to get a m r+ i 
and an exceptional set Td, r +\, of size less than u. Lemma 
|2] is applicable in the above fashion because condition [T] of 
Theorem [TJ holds. Define w := 2~2'rLi(~ l) r_1 w r . We then 
argue that if condition [2] of Theorem [TJ holds, w satisfies the 
conditions of LemmaQ] Applying Lemma[TJ the result follows. 
We give the entire proof in the Appendix. 

D. Proofs of Lemmas [JJ and [2] 

We prove the lemmas from the previous subsection here. 
Recall that k = |T| and u = |A|. 

1) Proof of Lemma\J] The proof is motivated by [9, Section 
II- A]. There is clearly at least one element in the feasible set 
of © - x - and hence there will be at least one minimizer 
of ©. Let (3 be a minimizer of (0. We need to prove that 
if the conditions of the lemma hold, it is equal to x. For any 
minimizer, /?, 

W)t4i<\\(x)tA\i:=Y.\ x i\ < 15 > 
is a 

Recall that x is zero outside of T U A, T and A are disjoint, 
and x is always nonzero on the set A. Take a w that satisfies 
the three conditions of the lemma. Then, 

ll(/?)T*||l = 5> i + (&-^)|+ £ 

> E s § n ( x i)( x J + (ft" ~ x j)) + w ' A J'ft 

= Z)i^i + S^jC8i-*i)+ E w '^-ft 

ieA jeA j^TuA 

+J2 w ' A ^-^) 



(16) 



= ||x T c||i+ W '(A/3-Ax) = Hx-Tcjli 

Now, the only way ( TToT l and (TT3T > can hold simultaneously 
is if all inequalities in ( TToT l are actually equalities. Consider 
the first inequality. Since is strictly less than 1 for all 

j ^ T U A, the only way 2~2jaruA 
is if /3j = for all j ^ TU A. 

Since both /3 and x solve ©, j/ = Ax = Af3. Since /3j = 
= Xj for all j ^ TUA, this means that y = Atua(/3)tua = 
Atua(x)tuA or that A T ua((/3)tuA ~ (x)tua) = 0. Since 
5k+ u < 1, A^ua is full rank and so the only way this can 
happen is if (/3)tua = (x)tuA- Thus any minimizer, f3 = x, 
i.e. x is the unique minimizer of 

2) Proof of Lemma [2} The proof of this lemma is signifi- 
cantly different from that of the corresponding lemma in J9), 
even though the form of the final result is similar. 

Any w that satisfies At'w = will be of the form 



w 



[I - A T (A T ' 'At)' 1 A T ']-y := Mj 



(17) 
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We need to find a 7 s.t. A 

Let 7 = M'AT d r). Then 77 



c, i.e. 



A Td 'M-y 



(A Td 'MA Td )~ 1 c. This follows because MM' 



M 



since M is a projection matrix. Thus, 

* = MM' At^ (AtJMAtJ^c = MA Td {A Td 'MA Td ) 
Consider any set f d with \ f d \ < S disjoint with TUT d . Then 



IV. Comparison of CS and Modified-CS 

In Theorem Q] and Corollary [T] we derived sufficient condi- 
tions for exact reconstruction using modified-CS. In Sec lIV-Al 
we compare the sufficient conditions for modified-CS with 
^g^those for CS. In Sec. IIV-BI we use Monte Carlo to compare 
the probabilities of exact reconstruction for both methods. 



\\A fd 'w\\ 2 <\\A td 'MA Td \\ UAt/MAtJ-'W ||c|| 2 (19) 
Consider the first term from the right hand side (RHS) of dT9l >. 

\\A fd 'MA Td \\<\\A t jA Td \\ 



< 



9 



S.k 



s,s 



1 



vWAfjAriAr'AT^Ar'A^W 

PC 1 t. 

(20) 
Since 



Consider the second term from the RHS of ( fT9] l. 
AtJ ' M Ar d is non-negative definite, 



UAtJMAt^W = 



1 



XmUA Td 'MA Td ) 



(21) 



Now, A Td 'MA Td = A Td A Td - A Td ' A T (A T ' At)' 1 A T ' A Td 
which is the difference of two symmetric non-negative definite 
matrices. Let B\ denote the first matrix and B 2 the second one. 
Use the fact that A min (Bi - B 2 ) > A min (B 1 ) + A min (-B 2 ) = 
A m in(Bi) - A max (B 2 ) where A min (.), A min (.) denote the min- 
imum, maximum eigenvalue. Since A m i n (i?i) > (1 — 8s) and 
<\(a Th 'a t )\\ 2 ^ e\ ik 



A m ax(-B 2 ) = H-B2U < 



1-Sk 



< 



\\(A Td 'MA Td y 



< 



1 



thus 



(22) 



l-Ss- 



as long as the denominator is positive. It is positive because 
we have assumed that Ss + Sk + Of s < 1- Using (l20l i and (l22l 
to bound ([19), we get that for any set f d with \ f d \ < S, 



< 



's,s 



l-Sk 



C 2 



1-Ss- 



ak(S,S)\\c\\ 



(23) 



where dk(S, S) is defined in (©. Notice that a,k(S, S) is non- 
decreasing in k, S, S. Define an exceptional set, E, as 



{j e(TUT d r:\A/w\>^^l\\ch} 



(24) 



Notice that \E\ must obey \E\ < S since otherwise we can 
contradict (fill) by taking f d C E. 

Since \E\ < S and E is disjoint with T U T d , d23]l holds for 
f d = E, i.e. ||^4b'w||2 < a k (S, S)\\c\\ 2 - Also, by definition of 



E, \A, 



< 



C 2. 



for all j ^ T U T d U E. Finally, 



|^!| 2 < I MAT d (Ax d ' MA? d ) _ 

< ||M|| ||A T J| \\{A Td 'MA Td ) 



C\\2 

1 1 



< 



1-Ss- 



■||c|| 2 = iif fc (5)||c|| 2 



(25) 



l-Sk 



since ||M|| = 1 (holds because M is a projection matrix). 
Thus, all equations of O hold. Using (HBJ, O holds. ■ 



A. Comparing sufficient conditions 

We compare the sufficient conditions for modified-CS and 
for CS, expressed only in terms of Ss's. Sufficient conditions 
for an algorithm serve as a designer's tool to decide the number 
of measurements needed for it and in that sense comparing the 
two sufficient conditions is meaningful. 

For modified-CS, from Corollary Q] the sufficient condition 
in terms of only S s 's is 2S 2u + S 3u + S k + $l +u - 
Using k = s + e — u, this becomes 

2<5 2ti + S 3u + S s+e - u + Sg, 



■2^ +2 „<l. 



2Si +e+u <l. 



(26) 



For CS, two of the best (weakest) sufficient conditions that 
use only <5s's are given in fl22l . l23l and ifTTl . Between these 
two, it is not obvious which one is weaker. Using |22| and 
IfTTl . CS achieves exact reconstruction if either 



»2.s 



< 



V2 - 1 or 82 



8 3s < 1. 



(27) 



To compare d26b and d27] i. we use u = e = 0.02s which 
is typical for time series applications (see Fig. [TJ. One way 
to compare them is to use S cr < c8 2r l24l Corollary 3.4] to 
get the LHS's of both in terms of a scalar multiple of S 2u . 
Thus, d26j> holds if 8 s+e+u < 1/2 and 8 2u < 1/132.5. Since 
8 s +e+u = 8$ 2u < 528 2u , the second condition implies the 
first, and so only 8 2u < 1/132.5 is sufficient. On the other 
hand, d27l i holds if 8 2u < 1/241.5 which is clearly stronger. 

Alternatively, we can compare (|26| | and (|27T i using the high 
probability upper bounds on 83 as in J9). Using ||9] Eq 3.22], 
for an m x n random Gaussian matrix, with high probability 
(w.h.p.), 8 S < flW/ m (~ ), where 

, where 




and binary entropy H(r) :— — rlogr — (1 — r)log(l — r) 
for < r < 1. Thus, w.h.p., modified-CS achieves exact 
reconstruction from random-Gaussian measurements if 



PmodCS 



+9n/r, 



9n/r 



2g n /r. 



- 9n/r 



< 1. 



(28) 



Similarly, from ( f27l ). w.h.p., CS achieves exact reconstruction 
from random-Gaussian measurements if either 



PCS ■= 9n/m 
PCS,1 '■= 9n/i 



+ 9n/r, 



— < 1 or 



^)<^-l. 



(29) 
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Modified-CS (u=e=s/50) 



(a) Plots of pes defined in 



(b) Plots of pes, 2 defined in j29\ 



T 1 




(c) Plots of pmodcs defined in 



Fig. 2. Plots of pes and pcs,i (in (a) and (b)) and p mo dCS (in (c)) against s/n for 3 different values of m/n. For p m odCS, we used 
u — e — s/50. Notice that, for any given m/n, the maximum allowed sparsity, s/n, for p mo dCS < 1 is larger than that for which either 
Pes < 1 or pcs,2 < V% — 1. Also, both are much smaller than what is observed in simulations. 



In Fig. |2 we plot p C s, Pcs,2 and p mo dcs against s/n 
for three different choices of m/n. For p m odcs, we use 
u = e = 0.02s (from Fig. [TJ. As can be seen, the maximum 
allowed sparsity, i.e. the maximum allowed value of s/n, 
for which either pes < 1 or pcs,2 < V% — 1 is smaller 
than that for which p m odcs < 1- Thus, for a given number 
of measurements, to, w.h.p., modified-CS will give exact re- 
construction from random-Gaussian measurements, for larger 
sparsity sizes, s/n, than CS would. As also noted in j5J, in all 
cases, the maximum allowed s/n is much smaller than what 
is observed in simulations, because of the looseness of the 
bounds. For the same reason, the difference between CS and 
modified-CS is also not as significant. 

B. Comparison using Monte Carlo 

So far we only compared sufficient conditions. The actual 
allowed s for CS may be much larger. To actually compare 
exact reconstruction ability of modified-CS with that of CS, 
we thus need Monte Carlo. We use the following procedure 
to obtain a Monte Carlo estimate of the probability of exact 
reconstruction using CS and modified-CS, for a given A (i.e. 
we average over the joint distribution of x and y given A). 

1) Fix signal length, n = 256 and its support size, s — 
O.ln = 26. Select to, u and e. 

2) Generate the to x n random-Gaussian matrix, A (gen- 
erate an m x n matrix with independent identically 
distributed (i.i.d.) zero mean Gaussian entries and nor- 
malize each column to unit £2 normfl 

3) Repeat the following tot = 500 times 

a) Generate the support, N, of size s, uniformly at 
random from [l,n]. 

b) Generate (x) N ~ Af(0, 1007). Set (x) N c = 0. 

c) Set y :— Ax. 

d) Generate A of size u uniformly at random from 
the elements of N. 

e) Generate A e of size e, uniformly at random from 
the elements of [1, n] \ N. 

'As pointed out by an anonymous reviewer, we actually do not need to 
normalize each column to unit norm. As proved in 1251 . a matrix with i.i.d. 
zero mean Gaussian entries with variance l/n will itself satisfy the RIP. If 
the variance is not l/n, there will just be a scaling factor in the RIP. This 
does not affect reconstruction performance in any way. 



f) Let T = N U A e \ A. Run modified-CS, i.e. solve 
©). Call the output x modC s- 

g) Run CS, i.e. solve © with T being the empty set. 
Call the output xcs- 

4) Estimate the probability of exact reconstruction using 
modified-CS by counting the number of times x mo dcs 
was equal to x ("equal" was defined as \\x mo( ics ~ 
x h/\\xh < 1CT 5 ) and dividing by tot = 500. 

5) Do the same for CS using xcs- 

6) Repeat for various values of to, u and e. 

We set n = 256 and s = 0.1?i and we varied m between 
0.16n = 1.6s and OAn = 4s. For each m, we varied u = 
|A| between 0.04s to s and e = |A e | between to 0.4s. 
We tabulate our results in Table U The case u = s and e = 
corresponds to CS. Notice that when m is just 0.19n = 
1.9s < 2s, modified-CS achieves exact reconstruction more 
than 99.8% of the times if u < 0.08s and e < 0.08s. In this 
case, CS has zero probability of exact reconstruction. With 
to = 0.3n = 3s, CS has a very small (14%) chance of exact 
reconstruction. On the other hand, modified-CS works almost 
all the time for u < 0.2s and e < 0.4s. CS needs at least 
to = 0.4?i = 4s to work reliably. 

The above simulation was done in a fashion similar to that 
of (9). It does not compute the to required for Theorem [TJ to 
hold. Theorem [TJ says that if to is large enough for a given s, 
u, e, so that the two conditions given there hold, modified-CS 
will always work. But all we show above is that (a) for certain 
large enough values of to, the Monte Carlo estimate of the 
probability of exact reconstruction using modified-CS is one 
(probability computed by averaging over the joint distribution 
of x and y); and (b) when u, e <C s, this happens for much 
smaller values of m with modified-CS than with CS. 

As pointed out by an anonymous reviewer, Monte Carlo 
only computes expected values (here, expectation of the indi- 
cator function of the event that exact reconstruction occurs) 
and thus, it ignores the pathological cases which occur with 
zero probability l26l . l27l . In |26l . the authors give a greedy 
pursuit algorithm to find these pathological cases for CS, i.e. 
to find the sparsest vector x for which CS does not give exact 
reconstruction. The support size of this vector then gives an 
upper bound on the sparsity that CS can handle. Developing 
a similar approach for modified-CS is a useful open problem. 
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TABLE I 

Probability of exact reconstruction for modified-CS. Recall that u = |A|, e = |A e | and s = \N\. Notice that u — s 

AND e = CORRESPONDS TO CS. 



(a) m = 0.16n 



(b) m = 0.19n 



e 

u 





0.08s 


0.24s 


0.40s 


e 

u 





0.08s 


0.24s 


0.40s 


0.04s 


0.9980 


0.9900 


0.8680 


0.4100 


0.08s 


0.9980 


0.9980 


0.9540 


0.7700 


0.08s 


0.8880 


0.8040 


0.3820 


0.0580 


0.12s 


0.9700 


0.9540 


0.7800 


0.4360 


s 


(CS) 0.0000 








s 


(CS) 0.0000 









(c) m = 0.25n 



(d) m = 0.30n 



(e) m = 0.40n 



e 

u 





0.08s 


0.24s 


0.40s 


e 

u 





0.08s 


0.24s 


0.40s 


0.04s 


1 


1 


1 


1 


0.04s 


1 


1 


1 


1 


0.20s 


1 


1 


0.9900 


0.9520 


0.20s 


1 


1 


1 


1 


0.35s 


0.9180 


0.8220 


0.6320 


0.3780 


0.35s 


1 


1 


0.9940 


0.9700 


0.50s 


0.4340 


0.3300 


0.1720 


0.0600 


0.50s 


0.9620 


0.9440 


0.8740 


0.6920 


s 


(CS) 0.0020 








s 


(CS) 0.1400 









e 

u 





0.40s 


0.04s 


1 


1 


0.20s 


1 


1 


0.35s 


1 


1 


0.50s 


1 


1 


s 


(CS) 0.9820 





C. Robustness to noise 

Using an anonymous reviewer's suggestion, we studied the 
robustness of modified-CS to measurement noise. Of course 
notice that in this case the true signal, x, does not satisfy the 
data constraint. Thus it is not clear if © will even be feasible. 
A correct way to approach noisy measurements is to relax the 
data constraint as is done for CS in J5] or 11221 . This is done 
for modified-CS in our recent work ||28l and also in l29l . 

In practice though, at least with random Gaussian measure- 
ments and small enough noise, © did turn out to be feasible, 
i.e. we were able find a solution, in all our simulations. We 
used n = 256, s = O.ln, u = e = 0.08s and m = 0.19n. 
We ran the simulation as in step [3] of the previous subsection 
with the following change. The measurements were generated 
as y := Ax + w where w ~ A/"(0, <x^I). We varied er^ 
and compared the normalized root mean squared error (N- 
RMSE) of modifie d-CS with that of CS i n Table UH N-RMSE 
is computed as ^/E[||a; — r£ 1 1 ^ ] /IE [ 1 1 1 1 ^ ] where E[.] denotes 
the expected value computed using Monte Carlo. Recall that 
a; 7v ~ A/"(0, 1007). When the noise is small enough, modified- 
CS has small error. CS has large error in all cases since m is 
too small for it. 

TABLE II 

Reconstruction error (N-RMSE) from noisy measurements. 



< 


0.001 


0.01 


0.1 


1 


10 


CS 


0.7059 


0.7011 


0.7243 


0.8065 


1.1531 


Modified-CS 


0.0366 


0.0635 


0.1958 


0.5179 


1.3794 



V. Extensions of Modified-CS 

We now discuss some key extensions - dynamic modified- 
CS, regularized modified-CS (RegModCS) and dynamic Reg- 
ModCS. RegModCS is useful when exact reconstruction does 
not occur - either m is too small for exact reconstruction or the 
signal is compressible. The dynamic versions are for recursive 
reconstruction of a time sequence of sparse signals. 

Before going further we define the 6%-energy support. 

Definition 1 (b%-energy support or b%-support): For 
sparse signals, clearly the support is N := {i E 
[l,n] : x\ > 0}. For compressible signals, we misuse 
notation slightly and let N be the b%-energy support, i.e. 



N := {i € [1, n] : x\ > (}, where £ is the largest real number 
for which N contains at least b% of the signal energy, e.g. 
b = 99 in Fig. Q] 



A. Dynamic Modified-CS: Modified-CS for Recursive Recon- 
struction of Signal Sequences 

The most important application of modified-CS is for re- 
cursive reconstruction of time sequences of sparse or com- 
pressible signals. To apply it to time sequences, at each time 
t, we solve © with T = Nt-i where Nt-i is the support 
estimate from t — 1 and is computed using (|7]). At t = we 
can either initialize with CS, i.e. set T to be the empty set, 
or with modified-CS with T being the support available from 
prior knowledge, e.g. for wavelet sparse images, T could be 
the set of indices of the approximation coefficients. The prior 
knowledge is usually not very accurate and thus at t = one 
will usually need more measurements i.e. one will need to use 
yo = AoXq where Aq is an mo x n measurement matrix with 
mo > m. The full algorithm is summarized in Algorithm [T] 

Threshold Selection. If m is large enough for exact 
reconstruction, the support estimation threshold, a, can be set 
to zero. In case of very accurate reconstruction, if we set a to 
be equal/slightly smaller than the magnitude of the smallest 
element of the support, it will ensure zero misses and fewest 
false additions. As m is reduced further (error increases), a 
should be increased further to prevent too many false additions. 
For compressible signals, one should do the above but with 
"support" replaced by the 6%-support. For a given m, b should 
be chosen to be just large enough so that the elements of the 
6%-support can be exactly reconstructed. 

Alternatively, one can use the approach proposed in lfl3l 
Section II]. First, only detect additions to the support using a 
small threshold (or keep adding largest elements into T and 
stop when the condition number of At becomes too large); 
then compute an LS estimate on that support and then use 
this LS estimate to perform support deletion, typically, using a 
larger threshold. If there are few misses in the support addition 
step, the LS estimate will have lower error than the output of 
modified-CS, thus making deletion more accurate. 



X 



Algorithm 1 Dynamic Modified-CS 



At t 

rump ||(/3)t 



0, compute xq as the solution of 
i, s.t. j/o = Aq(3, where T is either 
empty or is available from prior knowledge. Compute 
N = {i e [l,n] : (x )i > a}. For t > 0, do 

1) Modified-CS. Let T = N t -i. Compute x t as the 
solution of mmp ||(/?)t c ||i, s.t. yt = A/3. 

2) Estimate the Support. N t = {i G [ljfi] : (^t)i > a }- 

3) Output the reconstruction i t . 
Feedback JV* , increment t, and go to stepQ] 



table iii 

Comparing probability of exact reconstruction (prob) 
and reconstruction error (error) of regmodcs with 
different 7's. 7 = corresponds to modified-cs. 



(a) m = 0.16n 



7 


(modCS) 


0.001 


0.05 


0.1 


0.5 


1 


prob 


0.76 


0.76 


0.74 


0.74 


0.70 


0.34 


error 


0.0484 


0.0469 


0.0421 


0.0350 


0.0273 


0.0286 



(b) m = 0.12n 



(c) m = O.lln 



7 


(modCS) 


1 


prob 


0.04 





error 


0.2027 


0.0791 



7 


(modCS) 


1 


prob 








error 


0.3783 


0.0965 



B. RegModCS: Regularized Modified-CS 

So far we only used prior knowledge about the support to 
reduce the m required for exact reconstruction or to reduce 
the error in cases where exact reconstruction is not possible. 
If we also know something about how the signal along T 
was generated, e.g. we know that the elements of Xt were 
generated from some distribution with mean p,j<> we can use 
this knowledg^l to reduce the reconstruction error by solving 

mm||(/3) T c|| 1 + 7 ||(/3) T -Hll s.t. y = Aj3 (30) 

We call the above Regularized Modified-CS or RegModCS. 
Denote its output by x reg . 

We ran a Monte Carlo simulation to compare Modified-CS 
with RegModCS for sparse signals. We fixed n = 256, s = 
26 « O.ln, u = e = 0.08s. We used m = 0.16ra, 0.12n, O.lln 
in three sets of simulations done in a fashion similar to that 
of Sec. IIV-BI but with the following change. In each run of a 
simulation, we generated each element of Pn\A to be i.i.d. ±1 
with probability (w.p.) 1/2 and each element of p& and of ua c 
to be i.i.d. ±0.25 w.p. 1/2. We generated x N ~ 7V(/xjv, 0.017) 
and we set x^c =0. We set y := Ax. We tested RegModCS 
with various values of 7 (7 = corresponds to modified- 
CS). We used tot = 50. The results are tabulated in Table 
UlTT We computed the exact reconstruction probability as in 
Sec. lIV-Bl bv counting the number of times x reg equals x and 
normalizing. As can be seen, RegModCS does not improve the 
exact reconstruction probability, in fact it can reduce it. This is 
primarily because the elements of (x reg )A B are often nonzero, 
though smal0. But, it significantly reduces the reconstruction 
error, particularly when m is small. 

C. Setting 7 using an MAP interpretation of RegModCS 

One way to select 7 is to interpret the solution of (f30b as 
a maximum a posteriori (MAP) estimate under the following 
prior model and under the observation model of (0]). Given the 
prior support and signal estimates, T and fix, assume that xt 
and xt" are mutually independent and 

p(x T \T,p T ) =Af(x T ;HT,<rll), 

f 1 \ ' TC ' II ^T^ 111 

p{x T c\T,n T ) = \^—j e 5 , (31) 

2 Because of error in T, this knowledge is also not completely correct. 

3 But if we use x reg to first estimate the support using a small threshold, 
a, and then estimate the signal as A^*y, this probability does not decrease 
as much and in fact it even increases when m is smaller. 



i.e. all elements of x are mutually independent; each element 
of T c is zero mean Laplace distributed with parameter b p ; and 
the i th element of T is Gaussian with mean pi and variance 
CTp. Under the above model, if 7 = b p /2o-p in ( f30b , then, 
clearly, its solution, x reg , will be an MAP solution. 

Given i.i.d. training data, the maximum likelihood estimate 
(MLE) of b p , Up can be easily computed in closed form. 

D. Dynamic Regularized Modified-CS (RegModCS) 

To apply RegModCS to time sequences, we solve (f30b with 
T = Nt-i and pt = (xt-i)r- Thus, we use Algorithm Q] with 
step Q] replaced by 

mm \\(P)n U Ik +7ll(/% t _ 1 - (xt-i)ft t _J 2 2 s.t. y t = A(3 (32) 

and in the last step of Algorithm Q] we feed back x t and Nt. 

In Appendix [C] we give the conditions under which the 
solution of (l32l becomes a causal MAP estimate. To summa- 
rize that discussion, if we set 7 = b p /2a^ where & p ,<r? are 
the parameters of the signal model given in Appendix and 
if we assume that the previous signal is perfectly estimated 
from yo,---Vt-i with the estimate being zero outside Nt-i 
and equal to (x t -i)ft ± on it, then the solution of d32l will 
be the causal MAP solution under that model. 

In practice, the model parameters are usually not known. 
But, if we have a training time sequence of signals, we can 
compute their MLEs using (l42l , also given in Appendix [C] 

vi. reconstructing sparsified/true images from 
Simulated Measurements 

We simulated two applications: CS-based image/video com- 
pression (or single-pixel camera imaging) and static/dynamic 
MRI. The measurement matrix was A = i7$ where $ is the 
sparsity basis of the image and 77 models the measurement 
acquisition. All operations are explained by rewriting the 
image as a ID vector. We used $ = W' where W is an 
orthonormal matrix corresponding to a 2D-DWT for a 2-level 
Daubechies-4 wavelet. For video compression (or single-pixel 
imaging), H is a random Gaussian matrix, denoted G r , (i.i.d. 
zero mean Gaussian m x n matrix with columns normalized 
to unit £2 norm). For MRI, 77 is a partial Fourier matrix, i.e. 
H = AIF where M is an m x n mask which contains a single 
1 at a different randomly selected location in each row and all 
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other entries are zero and F is the matrix corresponding to 
the 2D discrete Fourier transform (DFT). 

N-RMSE, defined here as \\x t — x t l^/H^tlhi is used to 
compare the reconstruction performance. We first used the 
sparsified and then the true image and then did the same for 
image sequences. In all cases, the image was sparsified by 
computing its 2D-DWT, retaining the coefficients from the 
99% -energy support while setting others to zero and taking 
the inverse DWT. We used the 2-level Daubechies-4 2D- 
DWT as the sparsifying basis. We compare modified-CS and 
RegModCS with simple CS, CS-diff flS) and LS-CS lfT3l . 

For solving the minimization problems given in (O and 
d3Qb . we used CVX, http://www.stanford.edu/~boyd/cvx/ for 
smaller sized problems (n < 4096). All simulations of Sec.lIVI 
and all results of Table|IV|and Figs. [3]Hused CVX. For bigger 
signals/images, (i) the size of the matrix A becomes too large 
to store on a PC (needed by most existing solvers including 
the ones in CVX) and (ii) direct matrix multiplications take 
too much time. For bigger images and structured matrices 
like DFT times DWT, we wrote our own solver for (O by 
using a modification of the code in LI Magic 1301 . We show 
results using this code on a 256 x 256 larynx image sequence 
(n = 65536) in Fig. [5] This code used the operator form of 
primal-dual interior point method. With this, one only needs to 
store the sampling mask which takes 0(n) bits of storage and 
one uses FFT and fast DWT to perform matrix-vector multipli- 
cations in O(nlogn) time instead of 0(n 2 ) time. In fact for a 
6x6 image the cost difference is 0(b 2 log 6) versus 0(6 4 ). All 
our code, for both small and large problems, is posted online at 
http://www.ece.iastate.edu/~namrata/SequentialCS.html This 
page also links to more experimental results. 

A. Sparsified and True (Compressible) Single Image 

We first evaluated the single image reconstruction problem 
for a sparsified image. The image used was a 32 x 32 
cardiac image (obtained by decimating the full 128 x 128 
cardiac image shown in Fig. Q]), i.e. n = 1024. Its support 
size s = 107 ss O.ln. We used the set of indices of the 
approximation coefficients as the known part of the support, 
T. Thus, k = \T\ = 64 and so u = |A| > 43 which is 
a significantly large fraction of s. We compare the N-RMSE 
in Table [IV] Even with such a large unknown support size, 
modified-CS achieved exact reconstruction from 29% random 
Gaussian and 19% partial Fourier measurements. CS error in 
these cases was 34% and 13% respectively. 

We also did a comparison for actual cardiac and larynx 
images (which are only approximately sparse). The results 
are tabulated in Table [TV] Modified-CS works better than CS, 
though not by much since |A| is a large fraction of \N\. Here 
N refers to the 6% support for any large 6, e.g. 6 = 99. 

B. Sparsified Image Sequences 

We compared modified-CS with simple CS (CS at each time 
instant), CS-diff and LS-CS (13) for the sparsified 32 x 32 
cardiac sequence in Fig. [3] Modified-CS was implemented as 
in Algorithm [T] At t = 0, the set T was empty and we used 
50% measurements. For this sequence, 1 7V f | ps O.ln = 107, 
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(a) H = G T , m =0.5n, m=0.16n (b) H = MF, m =0.5n, m=0.16n 

Fig. 3. Reconstructing the sparsified 32 x 32 cardiac image sequence, 
s « O.ln, u « O.Oln, e « 0.005n. (a) H = G r , (b) H = MF. 
Similar results were also obtained for the larynx sequence. These are 
shown in J2] Fig. 3] (not repeated here due to lack of space). 

TABLE IV 

Reconstruction Error (N-RMSE) 





Sparsified 


True 


True 




Cardiac 


Cardiac 


Larynx 


CS (H = G r ,m = 0.29n ) 


0.34 


0.36 


0.090 


Mod-CS (H = G r ,m = 0.29n) 





0.14 


0.033 


CS (H = MF, m = 0.19n.) 


0.13 


0.12 


0.097 


Mod-CS (H = MF, m = 0.19n) 





0.11 


0.025 



u = |A| < 10 ps O.Oln and e = |A e | < 5 ps 0.005n. 
Since u <C \N t \ and e <C |iV t |, modified-CS achieves exact 
reconstruction with as few as 16% measurements at t > 0. Fig. 
|3(a)| used H = G r (compression/single-pixel imaging) and 
Fig. [3(b)] used H = MF (MRI). As can be seen, simple CS 
has very large error. CS-diff and LS-CS also have significantly 
nonzero error since the exact sparsity size of both the signal 
difference and the signal residual is equal to/larger than the 
signal's sparsity size. Modified-CS error is 10~ 8 or less (exact 
for numerical implementation). Similar conclusions were also 
obtained for the sparsified larynx sequence, see (2] Fig. 3]. 
This is not repeated here due to lack of space. 

C. True (Compressible) Image Sequences 

Finally we did the comparison for actual image sequences 
which are only compressible. We show results on the larynx 
(vocal tract) image sequence of Fig. Q] For Fig. @] we used a 
32 x 32 block of it with random Gaussian measurements. For 
Fig.[5]we used the entire 256x256 image sequence with partial 
Fourier measurements. At t = 0, modified-CS, RegModCS and 
LS-CS used T to be the set of indices of the approximation 
coefficients. 

For the subfigures in Fig. |4] we used H = G r (random 
Gaussian) and mo = 0.19n. Fig. |4(a)| and |4(b)| used m = 
0.19n,0.06n respectively. At each t, RegModCS-MAP solved 
d32l with bp, cr 2 estimated using d42l from a few frames of the 
sequence treated as training data. The resulting 7 = b p /2a 2 
was 0.007. RegModCS -exp-opt solved ® with T = N t -i, 
/.it = {x r eg,t-i)T and we experimented with many values of 
7 and chose the one which gave the smallest error. Notice 
from Fig. |4(a)| that RegModCS -MAP gives MSEs which are 
very close to those of RegModCS-exp-opt. 

Fig.[5]shows reconstruction of the full larynx sequence using 
H = MF, m = 0.19n and three choices of uiq. In |5(a)| we 
compare the reconstructed image sequence using modified-CS 
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(a) H=G r , m =0.19n, m=0.19n (b) H=G r , m =0.19n, m=0.06n 

Fig. 4. Reconstructing a 32 x 32 block of the actual (compressible) 
larynx sequence from random Gaussian measurements, n = 1024, 
99%-energy support size, s fts 0.07n, u m O.OOln and e ~ 0.002n. 
Modified-CS used a = 50 2 when m = 0.19n and increased it to 
a = 80 2 when m = 0.06n. 



with that using simple CS. The error (N-RMSE) was 8-11% 
for CS, while it was stable at 2% or lesser for modified-CS. 
Since mo is large enough for CS to work, the N-RMSE of CS- 
diff (not shown) also started at a small value of 2% for the 
first few frames, but kept increasing slowly over time. In |5(b)| 
|5(c)| we show N-RMSE comparisons with simple CS, CS-diff 
and LS-CS. In the plot shown, the LS-CS error is close to 
that of modified-CS because we implemented LS estimation 
using conjugate gradient and did not allow the solution to 
converge (forcibly ran it with a reduced number of iterations). 
Without this tweeking, LS-CS error was much higher, since 
the computed initial LS estimate itself was inaccurate. 



Original sequence 



[ 



□ 



CS-reconstructed sequence 




Modified CS reconstructed sequence 



(a) Reconstructed sequence. H=MF. m=0.19n, mo=0.5n. 




a iu in 3 iu ia at 

Time-j Time-* 

(b) H=M F, m =0.2n, m=0.19n (c) H=MF, m =0.19n, m=0.19n 

Fig. 5. Reconstructing the 256x256 actual (compressible) vocal tract 
(larynx) image sequence from simulated MRI measurements, i.e. H — 
MF. All three figures used m = 0.1971 for t > but used different 
values of mo- Image size, n = 256 = 65536. 99% energy support, 
|JV t | « 0.07n; u « O.OOln. In Fig. [3(a)] modified-CS used a = 10 2 
which is the smallest magnitude element in the 99% support. 



Notice from both Figs. [4] and [5] that modifiedCS and Reg- 
ModCS significantly outperform CS and CS-diff. In most cases, 
both also outperform LS-CS. RegModCS always outperforms 
all the others, with the difference being largest when m is 
smallest, i.e. in Fig. |4(b)| In Figs.[4]and |5(c)| CS-diff performs 
so poorly, in part, because the initial error at t = is very large 
(since we use only mo = 0.19n). As a result the difference 
signal at t = 1 is not compressible enough, making its error 
large and so on. But even when mo is larger and the initial 
error is small, e.g. in Fig. |5(b)| CS-diff is still the worst and 
its error still increases over time, though more slowly. 

VII. Conclusions and Future Directions 

We studied the problem of reconstructing a sparse signal 
from a limited number of its linear projections when the 
support is partly known (although the known part may contain 
some errors). Denote the known support by T. Modified- 
CS solves an t\ relaxation of the following problem: find 
the signal that is sparsest outside of T and that satisfies the 
data constraint. We derived sufficient conditions for exact 
reconstruction using modified-CS. These are much weaker 
than those for CS when the sizes of the unknown part of the 
support and of errors in the known part are small compared to 
the support size. An important extension, called RegModCS, 
was developed that also uses prior signal estimate knowledge. 
Simulation results showing greatly improved performance of 
modified-CS and RegModCS using both random Gaussian and 
partial Fourier measurements were shown. 

The current work does not bound the error either under 
noisy measurements or for compressible signals or for the 
TV norm. The former is done in l28l . BTIl for modified- 
CS and RegModCS respectively, and, in parallel, also in fl29l 
for modified-CS. A more important question for recursive 
reconstruction of signal sequences from noisy measurements, 
is the stability of the error over time (i.e. how to obtain a 
time-invariant and small bound on the error over time). This 
is studied in ongoing work [32]. The stability of RegModCS 
over time is a much more difficult and currently open question. 
This is due to its dependence on both the previous support and 
the previous signal estimates. 

A key application of our work is for recursive reconstruction 
of time sequences of (approximately) sparse signals, e.g. for 
real-time dynamic MRI. As pointed out by an anonymous 
reviewer, many MRI problems minimize the total variation 
(TV) norm. The modified-CS idea can be applied easily for 
the TV norm as follows. Let T contain the set of pixel indices 
whose spatial gradient magnitude was nonzero at the previous 
time (or should be nonzero based on some other available 
prior knowledge). Minimize the TV norm of the image along 
all pixels not in T subject to the data constraint. Also, by 
designing homotopy methods, similar to those in ifTTl for CS, 
one can efficiently handle sequentially arriving measurements 
and this can be very useful for MRI applications. 

Appendix 

Recall that k = \T\, u = |A|, e = |A e | and s = \N\. 
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A. Proof of Proposition [JJ 



equation of d33l ). simplify to 



The proof follows by contradiction. Suppose that we can 
find two different solutions fa and fa that satisfy y = Afa = 
Afa and have the same 1$ norm, u, along T c . Thus fa is 
nonzero along T (or a subset of it) and some set Ai of size u 
while fa is nonzero along T (or a subset of it) and some set A2 
also of size u. The sets Ai and A2 may or may not overlap. 
Thus A(fa-fa) = 0. Since (fa- fa) is supported on TUAiU 
A 2 , this is equivalent to A TuAl uA 2 (^i - /^tuAiUAs = 0. 
But if 8k+2u < 1, ^4tuAiUA 2 is full rank and so the only way 
this can happen is if fa — fa = 0, i.e fa = fa. 

Therefore there can be only one solution with £0 norm u 
along T c that satisfies that data constraint. Since x is one such 
solution, any other solution has to be equal to x. ■ 



B. Proof of Theorem [JJ 

We construct a w that satisfies the conditions of Lemma 
Q] by applying Lemma [2] iteratively as follows and defining 
w using d37l i below. At iteration zero, we apply Lemma [2] 
with T d = A (so that S = u), Cj = sgn(^ ) V j e A (so 
that ||c||2 = \/u), and with S = u. Lemma [2] can be applied 
because 5 U + 5 k + Q\ u < 1 (follows from condition Q] of the 
theorem). From Lemma[2] there exists a W\ and an exceptional 
set Td i, disjoint with T U A, of size less than S — u, s.t. 

A/wi =0, V j G T 
A/wi = sgn(xj), V j G A 
|T d ,i| < u 
II^T d ,/wi||2 < a k {u,u)\/u 

\A/ Wl \ < a k (u,u), Vj^TuAuT,,,! 
IMIa (33) 

At iteration r, apply Lemma [2] with 7j = A U T4 r (so that 
5 = 2u), ^ = V j G A, Cj = A/to r V j G TV and 5 = 
it. Call the exceptional set XV+i. Lemma [2] can be applied 
because 5 2u + &k + #| 2 < 1 (condition [T] of the theorem). 
From Lemma [2] there exists a w r+ i and an exceptional set 
Td >r +i, disjoint with T U A U TV, of size less than S = u, 
s.t. 



A/w r 
Aj'w r 



V j G T 
.1 = 0, V j G A 
-i = A/uv, V j G 7k r 
|?V +1 | < u 
r+1 'w r+ i\\ 2 < a k (2u,u)\\A Td r 'w r \\ 2 

uv+i| < ^ — \\A Td r w r \\ 2 

V w 

Vj t T U A U IV U TV +1 
IK+1II2 < K k (2u)\\A Tdr 'w r \\ 2 



(34) 



Notice that \Td.i\ < u (at iteration zero) and |TV+i| < u (at 
iteration r) ensures that |A U Td. r \ < S = 2u for all r > 1. 
The last three equations of (1341 1. combined with the fourth 



m^v+i'MV+ilh < a k (2u,u) r a k (u,u)y/u 
\Aj'w r +i\ < a k (2u,u) r a k (u,u), 

Vj i T U A U IV U TV +1 (35) 
||w r +i||2 < K k (2u)a k (2u,u) r ~ 1 a k (u,u)y/u 

(36) 

We can define 



-EM) 



(37) 



Since a k (2u,u) < 1, ||uv|| 2 approaches zero with r, and so 
the above summation is absolutely convergent, i.e. w is well- 
defined. 

From the first two equations of d33l and d34l ). 



A,- 'to = A/u^ = sgn(xj), V j G A 



(38) 



Consider A/w = A/ T,7=i(- 1 ) r ~ 1 ' w r for some j <£ T U 
A. If for a given r, j G TV, then A/w r = A/w r +i (gets 
canceled by the r + l'' 1 term). If j G TV_i, then A/w r = 
Aj w r -i (gets canceled by the r — l*' 1 term). Since TV and 
TV-i are disjoint, j cannot belong to both of them. Thus, 

(-ly^A/wr, Vj^TUA (39) 



E 



Consider a given r in the above summation. Since j £ 
Td, r U 7V_i UTUA, we can use <|35} to get |Aj :'u; r | < 
a fe '(2u,u)'- 1 afc(u,M). Thus, for all j <£ TU A, 



< 



E 

r:jfT d ^UT d ^ 

a k {u,u) 



a k {2u,u) r 1 a k (u,u) 



(40) 

1 - a k (2u,u) 

Since a k (2u, u) + a k (u,u) < 1 (condition [2] of the theorem), 

\A/w\ < 1, V j ^ T U A (41) 

Thus, from d38l and PTt . we have found a u; that satisfies 
the conditions of Lemma [T] From condition Q] of the theorem, 
5fc+u < 1- Applying Lemma [T] the claim follows. ■ 

C. Causal MAP Interpretation of Dynamic RegModCS 

The solution of d32l ) becomes a causal MAP estimate under 
the following assumptions. Let p(X\Y) denote the conditional 
PDF of X of given Y and let S(X) denote the Dirac delta 
function. Assume that 

1) the random processes {x t },{yt} satisfy the hidden 
Markov model property; p(y t \xt) = 5(y t — Axt) (re- 
statement of the observation model); and 

p(x t \x t -i) = p((xt)N t _ 1 \xt-i)p{{x t )N^_ 1 \xt-i), where 
KO^JVt-iK-i) =N({xt)Nt-i \ (a;t-i)jv t _i,cTp/) 



pdx^N^lxt-i) = 



1 

2b, 



exp 
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i.e. given Xt-i (and hence given N t _x), (x t )N t -i and 
(xt)N a _ are conditionally independent; (xt)N t -i is 



Gaussian with mean (xj_i)jv t _i while {xt)N_ 
mean Laplace. 
2) Xt-i is perfectly estimated from yo,yi 



p(x t -i\yo, ■ •■yt-i) = <5 zt-i 



2/t-i 

(zt-x)# t _ 




is zero 



and 



3) it is the solution of (l32l i with 7 = ^j-. 

p 

If the first two assumptions above hold, it is easy to see that 
the "causal posterior" at time t, p(xt\yi, ■ ■ ■ yt), satisfies 

IIC»t)T-C*t-l)Tll2 IKltj^olh 

p(x t \yi, ...y t ) 



CS(y t - Ax t )e 



where T :~ Nt-i and C is the normalizing constant. Clearly, 
the second assumption is only an approximation since it 
assumes that the posterior estimate of x t -\ is exactly sparse. 

If the last assumption also holds, then the solution of (|32] l is 
a maximizer of p(x t \yi, . . .y t ), i.e. it is a causal MAP solution. 

The MLE of b p , <jp can be computed from a training time 
sequence of signals, xq,Xi,X2, ■ ■ .%t max as follows. Denote 
their supports (6%-energy supports in case of compressible 
signal sequences) by Nq,Ni, . ■ ■ Nt max . Then the MLE is 

, Et=rii(^w, Hi 



(Xi-Xt-l)^] 



(42) 



E£riJVt-ii 
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